Research Article

Human Mutation
Official Journal of the Human Genome Variation Society (www.hgvs.org)

Transcriptional Hallmarks of Noonan Syndrome and
Noonan-Like Syndrome with Loose Anagen Hair

Giovanni Battista Ferrero,1 Gabriele Picco,2,3 Giuseppina Baldassarre,1 Elisabetta Flex,4 Claudio Isella,2,3
Daniela Cantarella,2,3 Davide Cora,5 Nicoletta Chiesa,1 Nicoletta Crescenzio,1 Fabio Timeus,1 Giuseppe Merla,6
Laura Mazzanti,7 Giuseppe Zampino,8 Cesare Rossi,9 Margherita Silengo,1 Marco Tartaglia,4 and Enzo Medico2,3*

1 Department of Pediatrics, University of Torino Medical School, Torino, Italy; 2 Department of Oncological Sciences, University of Torino Medical
School, Torino, Italy; 3 Laboratory of Oncogenomics, Institute for Cancer Research and Treatment, Candiolo (Torino), Italy; 4 Dipartimento di
Ematologia, Oncologia e Medicina Molecolare, Istituto Superiore di Sanità, Roma, Italy; 5 Laboratory of Systems Biology, Institute for Cancer
Research and Treatment (IRCC), 10060 Candiolo (Torino), Italy; 6 Medical Genetics Unit, IRCCS 'Casa Sollievo della Sofferenza', S. Giovanni
Rotondo, Italy; 7 Dipartimento di Pediatria, Università degli Studi di Bologna, Bologna, Italy; 8 Istituto di Clinica Pediatrica, Università Cattolica del
Sacro Cuore, Roma, Italy; 9 UO Genetica Medica, Policlinico S.Orsola-Malpighi, Bologna, Italy

Communicated by Nancy B. Spinner 

Received 9 August 2011; accepted revised manuscript 4 January 2012. 

Published online 17 January 2012 in Wiley Online Library (www.wiley.com/humanmutation). DOI: 10.1002/humu.22026



ABSTRACT: Noonan syndrome (NS) is among the most 
common nonchromosomal disorders affecting develop- 
ment and growth. NS is genetically heterogeneous, be- 
ing caused by germline mutations affecting various genes 
implicated in the RAS signaling network. This network 
transduces extracellular signals into intracellular biochem- 
ical and transcriptional responses controlling cell prolifer- 
ation, differentiation, metabolism, and senescence. To ex- 
plore the transcriptional consequences of NS-causing mu- 
tations, we performed global mRNA expression profiling 
on peripheral blood mononuclear cells obtained from 23 
NS patients carrying heterozygous mutations in PTPN11
or SOS1. Gene expression profiling was also performed in
five subjects with Noonan-like syndrome with loose ana- 
gen hair (NS/LAH), a condition clinically related to NS 
and caused by an invariant mutation in SHOC2. Robust 
transcriptional signatures were found to specifically discriminate
each of the three mutation groups from 21 age-
and sex-matched controls. Despite the only partial overlap 
in terms of gene composition, the three signatures showed 
a notable concordance in terms of biological processes 
and regulatory circuits affected. These data establish ex- 
pression profiling of peripheral blood mononuclear cells 
as a powerful tool to appreciate differential perturbations 
driven by germline mutations of transducers involved in 
RAS signaling and to dissect molecular mechanisms un- 
derlying NS and other RASopathies. 

Introduction

Dysregulation of RAS signaling has recently been recognized to 
underlie a group of clinically related disorders affecting develop- 
ment and growth [Schubbert et al., 2007; Tartaglia and Gelb, 2010; 
Tidyman and Rauen, 2009]. Most of these conditions, which are
collectively named RASopathies, share facial dysmorphism, a
wide spectrum of heart disease, reduced postnatal growth, vari- 
able cognitive defects, and susceptibility to certain malignancies. 
In these Mendelian traits, heterozygous germline mutations affect 
various genes coding for members of the small subfamily of RAS 
GTPases, signal relay proteins that function as modulators of RAS 
function, RAS effectors, and downstream signal transducers. Although
the majority of mutations appear to enhance signal traffic
through the RAS-mitogen-activated protein kinase (MAPK) axis,
each syndrome maintains distinctive phenotypic features.
In some of these disorders, a further level of complexity is due to 
genetic heterogeneity, which explains, in part, the observed clini- 
cal variability. Noonan syndrome (NS, OMIM 163950), which is 
the most common among these diseases, occurring approximately 
in 1:1000-1:2500 live births, represents a paradigmatic condition 
[Allanson, 2007; Tartaglia et al., 2010; Van der Burgt, 2007]. NS is 
genetically heterogeneous, with activating mutations in PTPN11, 
SOS1, KRAS, NRAS, RAF1, and BRAF occurring in approximately 
75% of affected individuals [Tartaglia et al., 2011]. NS is a clini- 
cally variable disorder, and recent studies have established clinically 
relevant genotype-phenotype correlations, such as a high prevalence
of pulmonic stenosis among subjects with a mutated PTPN11
allele, occurrence of hypertrophic cardiomyopathy in individuals 
heterozygous for a mutation in RAF1, or generally normal growth 
and cognition in subjects carrying a mutated SOS1 gene [Tartaglia 
et al., 2010]. In contrast to what is observed in NS, other RASopathies
exhibit a relatively homogeneous phenotype that generally reflects
an underlying genetic homogeneity. This is the case of Noonan-like
syndrome with loose anagen hair (NS/LAH), a rare condition
with clinical features partially overlapping those occurring in NS
[Mazzanti et al., 2003], and caused by the invariant c.4A>G missense
change (p.Ser2Gly) in SHOC2 [Cordeddu et al., 2009], a scaffold
protein with regulatory function that positively modulates RAS
signaling [Matsunaga-Udagawa et al., 2010; Rodriguez-Viciana et al.,
2006]. 

To provide first insights into the pathogenetic mechanisms underlying
NS and related RASopathies, a number of studies have been
directed to investigate the consequences of panels of disease-causing 
mutations on protein structure and function, and their perturbing 
effects on intracellular signaling [Tartaglia et al., 2010]. No attempt 
has been directed, however, to investigate the consequences of the 
aberrant activation of the RAS signaling network driven by the 
different disease-causing molecular lesions on the control of gene 
expression. Here, we explored the global gene expression profile of 
peripheral blood mononuclear cells (PBMCs) collected from two 
cohorts of subjects with mutations in the two most common NS 
disease genes (PTPN11 and SOS1), and a third group representative 
of the genetically homogeneous NS/LAH (SHOC2) in order to identify
transcriptional signatures specifically associated with aberrant
PTPN11/SHP2, SOS1, and SHOC2 function, as well as to evaluate
the extent and branching of intracellular signaling dysregulation 
associated with these specific pathological conditions. 

Methods 

Patient Selection

The study was approved by the Local Ethics Committee of the 
Regina Margherita Children's Hospital, Torino, Italy. Informed consent
was obtained from parents or guardians of all participants.
Patients were enrolled in the study between March 2006 and May 
2008. Controls were children with staturo-ponderal and neuromotor
development within normal limits. The diagnosis of NS was estab- 
lished according to Van der Burgt clinical criteria [van der Burgt 
et al., 1994] and confirmed by molecular analysis on genomic DNA 
isolated from 200 μl of peripheral blood by the QIAamp DNA
Blood Mini Kit (Qiagen, Hilden, Germany). The 15 coding exons
and exon/intron junctions of PTPN11 were amplified by PCR with
FastStart Taq DNA Polymerase (Roche Diagnostics Corporation, 
Indianapolis, IN) under standard conditions with the primers listed 
in Tartaglia et al. [2002]. SOS1 analysis was carried out by amplification
and sequencing of the 23 exons as previously described
[Tartaglia et al., 2007], and the SHOC2 gene was studied as reported in
Cordeddu et al. [2009]. The study cohort included 23 subjects with 
a diagnosis of NS associated with a germline mutation in PTPN11
(N = 17) or SOS1 (N = 6), and five individuals with NS/LAH due
to the invariant c.4A>G missense change in SHOC2. Mutation data
(location of affected residues and type of amino acid substitution)
are summarized in Supp. Table S1. An additional 21 samples were
obtained from age- and sex-matched controls. Informed consent was
obtained from all subjects included in the study. 

RNA Extraction and Processing for Microarray 

RNA was extracted from PBMCs, isolated from fresh blood samples
within 2 hr of collection, using the TRIzol Plus RNA purification
system (Invitrogen Corp., Carlsbad, CA) according to the
manufacturer's protocol. The quantification and quality analysis
of RNA were performed on a Bioanalyzer 2100 (Agilent Technologies,
Palo Alto, CA). Synthesis of cDNA and biotinylated cRNA
was performed using the Illumina TotalPrep RNA Amplification Kit 
(Ambion, Foster City, CA; Cat. n. IL1791), according to the manu- 
facturer's protocol. Quality assessment and quantification of cRNAs 
were performed with Agilent RNA kits on Bioanalyzer 2100. Hy- 
bridization of cRNAs (750 ng) was carried out on HumanRef8_V2 
BeadChips (Illumina Inc., San Diego, CA). Array washing was performed
using Illumina High stringency wash buffer for 10 min at
55°C, followed by staining using streptavidin-Cy3 dyes (Amersham
Biosciences, Buckinghamshire, UK), according to standard Illumina 
protocols. 

Data Analysis 

Cubic spline-normalized probe intensity data, together with detection
P-values, were obtained using the BeadStudio 3.1 software
(Illumina). Subsequent data processing, carried out with Excel (Microsoft
Corp., Redmond, WA), included: (1) scaling, log2 transformation,
and detection filtering; (2) removal of genes correlated with
age, sex, or differential leukocyte count; (3) log2 ratio transformation
and selection of genes differentially expressed between controls
and mutated groups; (4) Monte Carlo simulation for false discovery
rate estimation; (5) full leave-one-out classification analysis.
All procedures are described in detail in Supp. Methods. Log2 ratio
expression data were clustered and visualized using the GEDAS software
[Fu and Medico, 2007].
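
Detection filtering and log2 ratio transformation, as listed among the processing steps above, can be illustrated with a minimal Python sketch. It is not the authors' Excel procedure, which is described only in Supp. Methods. The detection cutoff (max_p), the minimum detected fraction (min_fraction), the intensity floor, and the use of the control mean as the reference for the log2 ratio are all assumptions made for illustration; scaling and covariate filtering are omitted.

```python
import math
import statistics


def preprocess(intensities, detection_p, control_idx,
               max_p=0.01, min_fraction=0.5, floor=1.0):
    """Detection-filter probes and convert intensities to log2 ratios.

    intensities, detection_p: dict mapping probe ID -> list of per-sample values.
    control_idx: positions of control samples (assumed log2 ratio reference).
    max_p, min_fraction, floor: hypothetical cutoffs, not taken from the paper.
    """
    ratios = {}
    for probe, values in intensities.items():
        # Keep a probe only if it is detected in enough samples.
        detected = sum(p <= max_p for p in detection_p[probe])
        if detected < min_fraction * len(values):
            continue
        # Floor the intensities so that log2 is always defined.
        logged = [math.log2(max(v, floor)) for v in values]
        # Express each sample relative to the mean of the controls.
        reference = statistics.mean([logged[i] for i in control_idx])
        ratios[probe] = [v - reference for v in logged]
    return ratios
```

With this layout, a probe whose detection P-values exceed the cutoff in most samples is dropped before any group comparison is attempted.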

Results 

PBMC Gene Expression Profiling of NS and NS/LAH 
Patients 

PTPN11, SOS1, and SHOC2 gene expression in human PBMCs
was preliminarily verified by in silico analysis on a published PBMC
gene expression dataset [Burczynski et al., 2006]. The analysis indicated
that these and other disease genes known to be implicated
in RASopathies are expressed, at varying levels, in human PBMCs
(Supp. Fig. S1). For gene expression profiling (GEP), we selected 23
NS patients including 17 subjects carrying a mutation in PTPN11
and six with a SOS1 lesion, five NS/LAH subjects with the invariant
c.4A>G SHOC2 mutation, and 21 age- and sex-matched controls
(Supp. Table S1). Total RNA extracted from PBMCs was processed
for GEP on Illumina Beadarrays. We verified expression of PTPN11,
SOS1, and SHOC2 mRNA in our samples by checking microarray
probe signal intensities for PTPN11 and SOS1, and quantitative real-time
PCR signals for SHOC2, which was not represented on the arrays
(Supp. Fig. S2). Out of the 20,589 probes analyzed on the array, 5,605
passed filtering for reliable signal detection and for not being correlated
with age, sex, or differential leukocyte count (Supp. Fig. S3
and Supp. Methods). Unsupervised hierarchical clustering of all
samples based on these probes revealed four major transcriptional
subgroups, two of which were enriched, respectively, in NS/LAH
and NS samples (Supp. Fig. S4). For supervised statistical detection
of genes differentially expressed between NS and NS/LAH samples
and controls, a multiple test including fold change (absolute log2
ratio > 0.5), t-test (P < 0.01), and signal-to-noise ratio (SNR > 0.5;
see also Supp. Methods) was applied to the following comparisons:
(1) NS+NS/LAH samples versus controls; (2) PTPN11 mutation-positive
samples versus controls; (3) SOS1 mutation-positive samples
versus controls; and (4) SHOC2 mutation-positive samples versus
controls. Four signatures were thus identified, composed of 125,
225, 73, and 1,407 probes, respectively (Supp. Table S2). A Monte
Carlo simulation considering 2,000 random sample permutations
was performed, which allowed the fraction of false-positive
hits to be estimated as acceptably low (0.001-5.5%; Supp. Table S3).
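
The logic of combining several selection filters and then estimating false positives by label permutation can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of Supp. Methods: the SNR is assumed to be the absolute mean difference divided by the sum of the two group standard deviations, the function names are hypothetical, and the t-test filter (P < 0.01) is left out because the standard library has no t-distribution; a call such as scipy.stats.ttest_ind could be added inside passes_filters.

```python
import math
import random
import statistics


def passes_filters(case, ctrl, min_abs_lfc=0.5, min_snr=0.5):
    """Return True if one probe passes the fold-change and SNR filters.

    case, ctrl: lists of log2 expression values for the two groups.
    SNR = |mean difference| / (sd_case + sd_ctrl) is an assumed definition.
    """
    diff = statistics.mean(case) - statistics.mean(ctrl)
    spread = statistics.stdev(case) + statistics.stdev(ctrl)
    snr = abs(diff) / spread if spread > 0 else math.inf
    return abs(diff) > min_abs_lfc and snr > min_snr


def count_hits(matrix, labels):
    """Count probes (rows) passing the filters; labels[j] is True for mutated samples."""
    hits = 0
    for row in matrix:
        case = [v for v, is_case in zip(row, labels) if is_case]
        ctrl = [v for v, is_case in zip(row, labels) if not is_case]
        if passes_filters(case, ctrl):
            hits += 1
    return hits


def estimate_false_positive_fraction(matrix, labels, n_perm=2000, seed=0):
    """Compare observed hits with the mean number of hits under shuffled labels."""
    rng = random.Random(seed)
    observed = count_hits(matrix, labels)
    if observed == 0:
        return observed, float("nan")
    shuffled = list(labels)
    total = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # random reassignment of samples to groups
        total += count_hits(matrix, shuffled)
    return observed, (total / n_perm) / observed
```

The returned fraction is the average number of probes selected with permuted labels divided by the number selected with the true labels, which is one simple way to express an expected share of false-positive hits.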

NS and NS/LAH Gene-Specific Transcriptional Signatures 
in Human PBMCs 

Expression of genes belonging to the four signatures is shown in 
Figure 1. Of note, the signatures obtained separately for the PTPN11,

Figure 1. PBMC transcriptional signatures discriminating NS and
NS/LAH patients from unaffected individuals. Heatmap representing
Log2 ratio expression for gene probes (rows) across samples (columns).
Higher than average (red) and lower than average (green) expression
levels are indicated according to the color bar reported below the diagram.
Samples are subdivided in four groups, from left to right: controls
(C-001-C-021), NS with a mutated PTPN11 allele (PT-001-PT-017), NS
with a SOS1 mutation (SO-001-SO-006), and NS/LAH with the c.4A>G
change (SH-001-SH-005). Four major transcriptional signatures composed
of genes that significantly discriminate control samples from
(1) PTPN11, SOS1, and SHOC2 mutation-positive samples (NS+NS/LAH
signature, 125 gene probes), (2) PTPN11 mutation-positive samples
(PTPN11 signature, 225 probes), (3) SOS1 mutation-positive samples
(SOS1 signature, 73 probes), and (4) SHOC2 mutation-positive samples
(SHOC2 signature, 1,407 probes) are shown.



SOS1, and SHOC2 mutations were found to discriminate the individual
mutation groups from controls more efficiently than
the signature characterizing the entire "RASopathy" cohort of
PBMCs with mutated PTPN11, SOS1, and SHOC2 alleles, indicating
the occurrence of significant heterogeneity among subgroups. Indeed,
the three disease gene-specific signatures displayed detectable, but 
only partial overlaps (Supp. Table S2E). The signature character- 
izing NS/LAH was the largest, including 1,394 genes. Within this 
group, the expression profiles were highly homogeneous, possibly 
because of the invariant occurrence of the SHOC2 c.4A>G mutation 
underlying this disorder, and appeared to be oppositely modulated 
within both the PTPN11 and SOS1 mutation-associated NS groups. 
A robust signature, characterized by 223 differentially expressed genes,
was also attained for the PTPN11 mutation group. This signature
was shared, in part, with the SHOC2 mutation group, while it noticeably
diverged in the SOS1 mutation group. Unlike
what was observed in the PTPN11 and SHOC2 mutation cohorts, the
SOS1 mutation group shared a signature of restricted size, which,
however, appeared to efficiently discriminate this group from controls.
The SOS1 mutation-associated signature appeared oppositely
modulated in the SHOC2 mutation group, while it was relatively
conserved among samples of the PTPN11 mutation group.

Figure 2. Transcriptional signatures classify PBMCs from subjects
with RASopathy and unaffected individuals. The four plots show the
results of a full leave-one-out classification analysis. Briefly, each
PBMC sample was left out of the dataset and received four classification
scores (y-axis, A-D) based on four signatures, fully constructed
on the remaining samples: (A) NS+NS/LAH signature; (B)
PTPN11 mutation-associated signature; (C) SOS1 mutation-associated
signature; (D) SHOC2 mutation-associated signature. Samples are subdivided
in four groups, as indicated on the x-axis, based on the genotype.
Grey horizontal lines indicate optimal putative classification thresholds.
Gene-specific signatures show high discriminating ability for the respective
groups of samples.

Overall, these findings document that the PTPN11, SOS1, and 
SHOC2 mutations induce detectable gene expression changes in 
PBMCs, and suggest that the specific perturbation in gene expres- 
sion modulation occurring within each subgroup cannot be simply 
ascribed to a differential perturbing effect of mutations in individual 
disease genes on the extent of signal flow through a common signal 
transduction pathway (i.e., the RAS-MAPK cascade). 

To verify whether the disease gene-specific signatures could re- 
liably distinguish samples with mutations in the respective disease 
genes from control samples in a diagnostic setting, we performed 
full leave-one-out cross-validation analysis. Briefly, each sample (ei- 
ther mutated or not) was individually removed from the dataset, and 
the remaining samples were used to select again significant genes 
and redefine the four signatures. The left-out sample was then clas- 
sified by calculating a weighted average score for each signature 
(NS+NS/LAH, PTPN11, SOS1, and SHOC2; see Supp. Methods). 
Finally, the four classification scores for each sample were displayed 
in dot plots (Fig. 2) and used for t-test-based statistics. Overall, the
RASopathy-associated score was significantly different between control
and NS+NS/LAH samples (t-test P-value < 0.001). It correctly
classified all the SHOC2 samples as well as the majority of PTPN11
samples (P < 0.001), but failed to discern SOS1 mutation-positive
samples from controls (P = 0.29). The PTPN11 mutation-associated
score was documented to discriminate efficiently samples with a
mutated PTPN11 allele from healthy controls (P < 0.001) and those
with SOS1 mutations (P < 0.005), but not from samples with mutated
SHOC2 (P = 0.706). The SOS1 mutation score maintained
a significant discrimination efficacy against control samples (P <
0.001), but displayed low specificity. Finally, the SHOC2 mutation
score, despite being derived from just four samples in the "leave-one-out"
approach, displayed very good sensitivity and specificity against
both controls (P < 0.001) and other mutated samples (SHOC2 vs
PTPN11: P < 0.001; SHOC2 vs SOS1: P < 0.001). Overall, these results
represent proof of concept that PBMC-derived transcriptional signatures
are sufficiently robust to be considered as distinctive for each
of the different conditions and to eventually be used for diagnostic
purposes.
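
The leave-one-out scheme described above, in which the signature is rebuilt without the held-out sample before that sample is scored, can be sketched in Python as follows. The gene selection rule (mean difference above a cutoff) and the weighting (weights equal to the case-minus-control mean difference) are assumptions chosen for brevity; the actual weighted average score is defined in Supp. Methods, and all function names are hypothetical.

```python
import statistics


def select_signature(matrix, labels, min_abs_diff=0.5):
    """Return {probe index: weight} for probes differing between groups.

    The weight is the case-minus-control mean difference (an assumed choice).
    """
    signature = {}
    for i, row in enumerate(matrix):
        case = [v for v, is_case in zip(row, labels) if is_case]
        ctrl = [v for v, is_case in zip(row, labels) if not is_case]
        diff = statistics.mean(case) - statistics.mean(ctrl)
        if abs(diff) > min_abs_diff:
            signature[i] = diff
    return signature


def score_sample(values, signature):
    """Weighted average of one sample's values over the signature probes."""
    total_weight = sum(abs(w) for w in signature.values())
    if total_weight == 0:
        return 0.0
    return sum(w * values[i] for i, w in signature.items()) / total_weight


def leave_one_out_scores(matrix, labels):
    """Score every sample with a signature built without that sample."""
    n_samples = len(labels)
    scores = []
    for left_out in range(n_samples):
        keep = [j for j in range(n_samples) if j != left_out]
        train = [[row[j] for j in keep] for row in matrix]
        train_labels = [labels[j] for j in keep]
        # Gene selection is repeated inside the loop, so the left-out
        # sample never influences the signature used to score it.
        signature = select_signature(train, train_labels)
        sample = [row[left_out] for row in matrix]
        scores.append(score_sample(sample, signature))
    return scores
```

Repeating gene selection inside the loop is what makes the analysis a "full" leave-one-out: selecting genes once on all samples and only re-scoring would give optimistically biased scores.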

In silico Data Mining Reveals Biological Significance 
of NS and NS/LAH Mutation-Specific Signatures 

To functionally characterize genes transcriptionally associated with
NS and NS/LAH causative mutations, we tested the PTPN11, SOS1,
and SHOC2 mutation-specific signatures for enrichment in functional
annotation keywords using DAVID [Huang da W et al., 2009;
see Supp. Methods]. This analysis was conducted first using all the
genes of each signature, then using subgroups of only up- or downregulated
genes. As shown in Supp. Table S4, the PTPN11 signature
was found to be enriched in genes encoding proteins with SH2 domains
(P < 0.001) and tyrosine-specific protein kinases (P < 0.01).
The SOS1 signature did not show significant enrichments, whereas
the downregulated genes of the SHOC2 signature were strongly enriched
for genes having a regulatory role in transcription (P < 10⁻⁷).
These results prompted additional data mining focused on genes
implicated in signal transduction and transcriptional control. The
three signatures were therefore assessed for significant enrichment
in kinase targets via the web-based "Kinase Enrichment Analysis"
tool [Lachmann et al., 2009] (Table 1). Interestingly, this analysis
documented that the PTPN11 signature displayed highly significant
enrichment in targets of tyrosine kinases, particularly SRC
family kinases (FYN, LYN, LCK, SRC) and SRC family interacting
kinases (CSK, SYK, ZAP70). Despite its small size, the SOS1 signature
displayed significant enrichment in substrates of LCK, while
the SHOC2 signature was enriched in targets of MAPK and SRC
family members and their interacting kinases. These data show that
a significant percentage of genes transcriptionally modulated by NS
and NS/LAH disease-causing alleles are themselves known targets
of tyrosine kinases involved in signal transduction.
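
Enrichment of a gene list for substrates of a given kinase is commonly assessed with an over-representation test. The Python sketch below uses a one-sided hypergeometric test as a generic example; the text does not state which statistic the Kinase Enrichment Analysis tool applies, so this should not be read as a description of that tool, and the function name and the numbers in the usage note are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_substrates, n_signature, n_overlap):
    """One-sided hypergeometric P-value, P(X >= n_overlap).

    n_background: number of genes that could have entered the signature.
    n_substrates: background genes annotated as substrates of the kinase.
    n_signature:  number of genes in the signature.
    n_overlap:    signature genes that are substrates of the kinase.
    """
    max_overlap = min(n_substrates, n_signature)
    total = comb(n_background, n_signature)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(
        comb(n_substrates, k) * comb(n_background - n_substrates, n_signature - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total
```

For example, hypergeom_enrichment_p(5000, 100, 200, 13) would give the probability of finding 13 or more substrates among 200 signature genes when 100 of 5,000 background genes are substrates; these figures are invented for illustration and are not taken from Table 1.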

Subsequently, we focused on protein-protein interactions, using 
the Gather [Chang and Nevins, 2006] and Genes2networks [Berger 
et al., 2007; see Supp. Methods] web-based tools. Of note, the 
only protein displaying significant interactor enrichment in both 
PTPN11 and SHOC2 signatures with both tools was CBL, recently 
found to be mutated in a condition partially overlapping NS [Martinelli
et al., 2010]. These data suggest that in silico protein-protein
interaction analysis of the gene expression signatures associated with the
different RASopathies can represent an informative tool to identify
new candidate disease genes for these disorders.

The fact that the SHOC2 mutation was found to downregulate 
a large number of transcription factors (TFs) prompted an
in-depth analysis of circuits of transcriptional regulation within the



Table 1. Enrichment of the PTPN11, SOS1, and SHOC2 Signatures
for Substrates of Kinases

Signature   Kinase    Number of substrates   Enrichment
                      in signature           P-value
PTPN11      INSR       9                     4.09E-03
            PDGFRB     5                     5.81E-03
            ERBB3      6                     1.57E-03
            ERBB4      4                     8.43E-03
            SRC       12                     2.53E-03
            LCK       13                     3.72E-06
            FYN       12                     2.10E-04
            LYN       10                     2.28E-04
            SYK        7                     1.15E-03
            CSK        5                     2.31E-03
            ZAP70      5                     3.49E-03
            ITK        4                     3.31E-03
            BTK        5                     6.64E-03
            AXL        3                     9.35E-03
SOS1        LCK        5                     1.84E-03
            PRKAA2     2                     2.81E-03
SHOC2       PDGFRB    12                     6.23E-03
            SYK       17                     1.19E-03
            CSK       10                     9.09E-03
            FYN       26                     9.19E-03
            ZAP70     11                     6.67E-03
            ITK        8                     7.00E-03
            MAPK11     6                     6.40E-03
            MAPK14    52                     3.97E-04

Gene lists from each of the three signatures were tested on the KEA web-based tool for
enrichment in substrates of kinases. The table reports only kinases whose substrates
were significantly enriched (P < 0.01).



NS and NS/LAH signatures. To this aim, we searched for cases of
concomitant presence within the same signature of TFs and their
predicted targets. The results of this analysis, performed with the
oPOSSUM tool [Ho Sui et al., 2005], highlighted four cases of concomitant
and significant coregulation (Supp. Table S5 and Supp.
Methods). In PTPN11-mutated samples, GFI1 was negatively regulated
with respect to controls (P < 0.001) and its targets were
preferentially downmodulated. In SOS1-mutated samples, GABPA
was significantly upregulated (P < 0.01) and its targets were preferentially
downregulated. Finally, SHOC2-mutated samples displayed
higher expression of CREB1 (P < 0.001) and SP1 (P < 0.001), while
the respective targets were preferentially downregulated. Overall,
these results indicate the presence of at least one transcriptional
regulation circuit in each signature (Fig. 3).

Discussion 

Transcriptome analysis is a key tool to explore biological com- 
plexity of human diseases. We applied this approach to RASopathies 
with the aim of finding molecular correlates of the mutational status 
in PBMCs, focusing on the two genes most frequently mutated in 
NS, PTPN11 and SOS1, and on SHOC2, which has been recently
discovered to cause NS/LAH, a disorder with clinical overlap with 
the former [Cordeddu et al., 2009]. 

When grouped together and compared to age-matched unaffected
individuals, NS and NS/LAH-derived samples yielded a transcriptional
signature of 123 genes that correctly classified most samples.
Such a signature, however, was not representative of the samples
heterozygous for a mutated SOS1 allele and a fraction of subjects
with mutations in PTPN11. When the overall cohort of NS
patients was subdivided on the basis of the genetic lesion in the 
three gene-specific subgroups, larger and more homogeneous sig- 
natures emerged despite the lower sizes of subgroups. These results 
show that, although germline mutations in PTPN11, SOS1, and

Figure 3. Putative circuits of transcriptional regulation in PBMCs
from subjects with NS and NS/LAH. The four drawings summarize the
results of transcription factor and binding site analysis conducted on
lists of genes from the PBMC transcriptional signatures characterizing
the PTPN11, SOS1, and SHOC2 mutation groups. In each drawing, the
oval reports the mutated gene driving the signature, and the two links
indicate concomitant and significant regulation in the same signature of
a transcription factor (top link) and of its putative target genes (bottom
link).



SHOC2 deregulate the RAS-MAPK pathway, each mutated gene 
drives specific perturbations in intracellular signaling leading to 
different transcriptomic changes. Leave-one-out analysis confirmed 
that such gene-specific signatures correctly classified most NS and 
NS/LAH patients, which opens the way to potential clinical diag- 
nostic application of this approach. Partial overlap was observed 
between the PTPN11 mutation-associated transcriptome and, in a
mutually exclusive manner, those associated with SOS1 and SHOC2
gene mutations, allowing the definition of two PTPN11 subgroups, whose
biological significance remains to be elucidated. SOS1 and SHOC2
mutations appeared to drive anticorrelated transcriptional changes.
Interestingly, there was significantly more transcriptome perturbation
in SHOC2-mutated specimens (1,394 genes) as compared to
PTPN11- and SOS1-mutated samples (223 and 73 genes, respectively).
Within the PTPN11 subgroup, no significant associations
were found between transcriptional profiles and the clinical scoring 
system developed by van der Burgt [van der Burgt et al., 1994], 
possibly due to the small number of cases analyzed. 

The differences highlighted by transcriptional profiling were 
found to be consistent with the different role of SHP2, SOS1, and 
SHOC2 in modulating intracellular signaling. SHP2 is a nonreceptor
protein tyrosine phosphatase [Neel et al., 2003] required for
efficient activation of growth factor-induced RAS-MAPK signaling
via multiple potential mechanisms [Dance et al., 2008]. Moreover,
besides the positive modulatory role on RAS signaling, SHP2 controls
additional signal transduction pathways, such as those linked to
STAT and SRC proteins that are well known to contribute significantly
to transcriptional control [Grossmann et al., 2009; Xu and Qu,
2008; Zhang et al., 2004]. SOS1 has instead a narrower role, being
a bifunctional guanine nucleotide exchange factor (GEF) for RAS 
and RAC [Nimnual and Bar-Sagi, 2002]. This difference in func- 
tion together with the possibility of a cell context specificity of the 
perturbing effect of mutations on intracellular signaling could ex- 
plain the only partial overlap between the signatures characterizing 
the SOS1- and PTPN11-mutation groups. Of particular relevance
is the fact that the invariant c.4A>G (p.Ser2Gly) SHOC2 mutation 
was observed to provoke a profound alteration in the PBMC tran- 
scriptome. SHOC2 encodes a widely expressed protein supposed to 
be required for efficient RAF1 activation following growth factor 
stimulation by promoting membrane translocation of the catalytic 
subunit of protein phosphatase 1 (PP1C) that is required for stable
RAF1 binding to RAS [Rodriguez-Viciana et al., 2006]. The
invariant missense change was demonstrated to introduce an N- 
myristoylation site that causes stable translocation to the plasma 
membrane of the mutated protein and enhanced ERK1/2 phosphorylation
in a cell context-dependent fashion [Cordeddu et al., 2009].
Being a scaffold protein permanently anchored at the plasma membrane,
myristoylated SHOC2 may exert still uncharacterized actions
leading to massive transcriptional deregulation. Intriguingly, 110 
of the 225 genes composing the PTPN11 signature are also present 
and concordant in the SHOC2 signature, but most of the remaining 
several hundreds of the SHOC2 signature genes display a SHOC2- 
specific behavior. This finding strongly suggests that SHOC2 might 
control not only the RAS-MAPK axis, but also other signaling path- 
ways and/or cellular processes. Consistent with the present findings, 
it was demonstrated that SHOC2 translocates to the nucleus following
growth factor stimulation [Cordeddu et al., 2009], which
supports the idea of a possible direct involvement of this protein in 
the control of processes linked to gene expression. 

Another striking finding of this work is the opposite sign of regu- 
lation of SOS1 target genes in the SHOC2 mutation group, and vice 
versa. In this case, despite the fact that both gene products in prin- 
ciple positively regulate the RAS-MAPK axis, gene-specific features 
of signal transduction apparently drive opposite transcriptional re- 
sponses. A possible explanation for this paradox is that aberrant sig- 
naling by a mutated gene can be counteracted by negative feedback 
loops that under particular circumstances may account for most 
of the transcriptional changes observed at the steady-state level. 
According to this view, the PTPN11, SOS1, and SHOC2 mutation- 
associated transcriptomes may be considered not only to directly
report the grade of activity of the RAS-MAPK axis, but also to highlight
a more complex transcriptional circuitry that in some cases
may result in opposite changes [Amit et al., 2007].

Downstream of the affected signaling pathways, NS and NS/LAH 
gene mutations ultimately drive functional alterations that result 
in clinically observable phenotypic traits. Indeed, by looking at the 
functions of the proteins encoded by genes included in the various 
above-mentioned signatures, we could reconstruct at least some of 
the regulatory circuits potentially involved in the molecular patho- 
genesis of these disorders. Basic functional keyword enrichment 
analysis revealed that many of the genes regulated by PTPN11, SOS1,
and SHOC2 mutations are themselves involved in signal transduc- 
tion and control of transcription. Subsequent deeper analyses fo- 
cused on these features highlighted interesting properties of the 
transcriptional targets of signaling pathways modulated by SHP2, 
SOS1, and SHOC2. In particular, a higher than expected representation
of substrates of members of the SRC family of tyrosine kinases
(FYN, LYN, LCK, SRC) was observed. This finding highlights a 
complex interplay between mutations in PTPN11, SOS1, and 
SHOC2, and this family of kinases known to be involved in signal- 
ing through the MAPK cascade [Zhang et al., 2004] and to regulate 
fundamental cellular processes such as growth, shape change, and 
migration in multiple cell lineages [Parsons and Parsons, 2004]. 
Based on these findings, it can be speculated that such an inter- 
play could be at the basis of mesenchymal alterations giving rise 
to skeletal, cardiac, and hemopoietic abnormalities observed in NS 
and other RASopathies. Through a different approach, based on 
mining protein-protein interaction databases, we found that a high 
fraction of PTPN11 and SHOC2 target genes encode proteins in- 
teracting with the E3 ubiquitin ligase, CBL. Intriguingly, germline 
CBL mutations have been recently found in a condition with clin- 
ical features partially overlapping NS and with predisposition to 
hematologic malignancies during childhood, as well as in diverse 
myeloproliferative disorders and myeloid leukemias as somatically 
acquired lesions [Martinelli et al., 2010; Niemeyer et al., 2010; Perez 
et al., 2010]. Altogether, these results reveal a highly integrated ge- 
netic program, whereby biochemical activation of the RAS-MAPK 
axis drives transcriptional regulation of a relevant subset of proteins 
involved in functionally related signaling networks. In this view, 
deeper exploration of transcriptome/interactome connections may 
highlight new candidate genes for RASopathies that have not yet been
molecularly elucidated.
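
As an illustration of what a "higher than expected representation" means in this kind of analysis, over-representation of an annotated gene set (for example, SRC-family kinase substrates or CBL interactors) within a signature is commonly scored with a one-sided hypergeometric test. The sketch below is a minimal example under that assumption; it is not necessarily the procedure used in this study, and the function name and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population: number of genes in the background (e.g., all genes on the array)
    annotated:  background genes carrying the annotation of interest
    sample:     number of genes in the signature
    overlap:    signature genes carrying the annotation
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability mass of every outcome at least as extreme as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call: 20,000 background genes,
# 400 annotated as kinase substrates, a 150-gene signature, 12 of them annotated.
p_value = hypergeom_enrichment_p(20000, 400, 150, 12)
print(f"enrichment p-value: {p_value:.2e}")
```

With these illustrative counts, about three annotated genes would be expected in the signature by chance, so an overlap of 12 yields a small p-value. In practice, such p-values would be corrected for the number of annotation categories tested.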

Finally, we focused on TF/target gene circuits modulated by the 
PTPN11, SOS1, and SHOC2 mutations. The most interesting one 
involves GFI1 (growth factor independence 1), which was negatively
modulated in samples with mutated PTPN11 and whose predicted
targets were concordantly downregulated in the same samples. Notably,
children with NS present an increased risk of myeloproliferative
disorder (MPD) [Kratz et al., 2005], and GFI1 loss of function
has been documented to cause MPDs [Khandanpour et al., 2011]. 
This evidence suggests that GFI1 downmodulation could be causally 
linked to MPD susceptibility in NS. In principle, genes differentially 
expressed in PBMCs could also be regulated in other tissues, and, 
therefore, related to nonhematologic anomalies. As an example, 
altered gene expression in the blood has been found to correlate 
with Huntington's disease, a specific neurodegenerative autosomal 
dominant disorder [Runne et al., 2007]. Indeed, GFI1 is also
involved in the development of the inner ear hair cells [Moroy,
2005], and its mRNA was robustly downregulated in two independent
murine models of hearing loss [Hertzano et al., 2004; Lewis
et al., 2009]. In this respect, PTPN11 mutation-driven GFI1
downregulation could play a key role in hearing abnormalities observed
in NS [Scheiber et al., 2009; Qiu et al., 1998]. In the SHOC2
signature, two TFs, CREB1 and SP1, were consistently upregulated, while
their targets were preferentially downmodulated. CREB1
encodes a 43-kDa basic/leucine zipper (bZIP) TF known to be a
target of the MAPK/ERK pathway [Morgan et al., 2001]. Interestingly,
hippocampi derived from a knockout mouse model of neurofibromatosis
presented increased activity of the RAF-ERK axis and of
CREB [Guilding et al., 2007], indicating a possible involvement of CREB in
the pathogenesis of cognitive impairment observed in RASopathies.
SP1 belongs to the SP/KLF TF family and is a MAPK target [Benasciutti
et al., 2004; Curry et al., 2008]. Interestingly, enhanced SP1
activity has been linked to cardiac hypertrophy, a recurrent cardiac
anomaly in NS [Azakie et al., 2006; Hu et al., 2010; Lin et al., 2009].
Finally, the putative GABPA circuit detected in the SOS1 signature is 
consistent with the fact that GABPA is a known target of the MAPK 
pathway [Flory et al., 1996; Fromm and Burden, 2001]. Altogether,
functional data mining focused on signal transduction and TF activity
highlighted genes and modules of transcriptional regulation
present in the PTPN11, SOS1, and SHOC2 signatures that provide 
useful hints on the molecular pathogenesis of NS. 

It is likely that current advances in massive sequencing will pave 
the way to molecular characterization of all germline mutations 
causing RASopathies. In this perspective, the clinical potential of
transcriptional NS signatures will reside not so much in first-diagnosis
applications as in their utility as a transcriptional
readout of the actual functional status of the affected tissue. In this
view, the results shown here open the way to exploiting PBMC gene
signatures as surrogate markers of specific MAPK pathway activation
driven by NS gene mutations and, therefore, as a powerful tool
to monitor the biological response to molecularly targeted drugs.